SAS and Hadoop—living in the same house


With the simple introduction in Understanding Hadoop security, configuring Kerberos with Hadoop alone looks relatively straightforward. Your Hadoop environment sits in isolation within a separate, independent Kerberos realm with its own Kerberos Key Distribution Center. End users log into a machine hosting the Hadoop clients and, from that host, run processing against the Hadoop services.

But how does SAS fit into this picture? Where will the SAS servers and clients be located in relation to the Hadoop Kerberos realm? This post provides more insight into the second of the four key practices for securing a SAS-Hadoop environment:

Simplify Kerberos setup by placing SAS and Hadoop within the same topological realm.

After reading this post, a coworker told me botanists have a term that fits this concept perfectly: monoecious, from the Greek meaning “one household”. Some trees, like hollies and ginkgos, have male and female flowers on separate plants, but for most plants the connections of life are made much simpler by being monoecious, that is, by keeping the important elements in close proximity. Here’s why that works for SAS-Hadoop-Kerberos too!

What happens if SAS and Hadoop are in different realms

It’s unlikely that many SAS and Hadoop environments will be installed at the same time. Often one or more already exists. If you have an existing SAS environment in your corporate realm and you’ve just followed the instructions from your Hadoop provider for configuring Kerberos, you’ll probably have the setup in Figure 1. SAS server and user authentication will happen in the corporate realm, while access to Hadoop is governed by the Kerberos Key Distribution Center in the Hadoop realm and will happen there.

However, the major thing missing from the customer’s environment is represented by the green arrow at the top of the diagram below: the trust relationships between the Corporate Domain and the new Hadoop Realm. A domain administrator must create these trusts by mapping users between the two realms. Without at least a one-way trust, SAS will not be able to interact with Hadoop at all. This topology is one of the more complex arrangements, because SAS administrators and their IT departments need to set up all the required domain trusts represented by that little green arrow.
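As a rough illustration of what that green arrow involves on the Kerberos side, the sketch below builds the name of the cross-realm ticket-granting principal and a client-side krb5.conf fragment that lists both realms. The realm names, KDC host names and DNS domains are hypothetical placeholders, and the helper functions are not part of any SAS or Hadoop tool. The real trust setup depends on your KDC software and your Hadoop distribution, so treat this as a starting point for discussion with your domain administrator.

```python
def cross_realm_principal(client_realm, service_realm):
    """Name of the ticket-granting principal that lets clients in
    client_realm obtain service tickets in service_realm (a one-way trust)."""
    return f"krbtgt/{service_realm}@{client_realm}"


def krb5_fragment(realms, domains):
    """Render the [realms] and [domain_realm] sections of a krb5.conf file.

    realms maps realm name -> KDC host; domains maps DNS domain -> realm.
    """
    lines = ["[realms]"]
    for realm, kdc in realms.items():
        lines += [f"  {realm} = {{", f"    kdc = {kdc}", "  }"]
    lines.append("")
    lines.append("[domain_realm]")
    for domain, realm in domains.items():
        lines.append(f"  {domain} = {realm}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    corp, hadoop = "CORP.EXAMPLE.COM", "HADOOP.EXAMPLE.COM"
    print(cross_realm_principal(corp, hadoop))
    print(krb5_fragment(
        {corp: "kdc.corp.example.com", hadoop: "kdc.hadoop.example.com"},
        {".corp.example.com": corp, ".hadoop.example.com": hadoop},
    ))
```

For a one-way trust in which corporate users reach Hadoop services, the principal returned by `cross_realm_principal` would need to exist, with matching keys, in both Key Distribution Centers.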

Once trusts are established, there are additional steps to ensure back-end Kerberos authentication for SAS processes running in the Corporate Realm. Ideally, to access Hadoop Services while running SAS processes, the operating system should be configured to perform the kinit step to obtain the correct Ticket Granting Ticket (TGT). Unless the operating system is given this capability, the SAS processes will be unable to request the Service Ticket and so will be unable to authenticate.
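The kinit step described above can be sketched as follows. The post has the operating system perform this at session start; the code only illustrates the equivalent manual step. Using a keytab rather than a password prompt is an assumption made here so the sketch can run unattended, and the principal and keytab path in the usage note are hypothetical.

```python
import subprocess


def kinit_command(principal, keytab):
    """Build a kinit call that obtains a TGT from a keytab
    instead of prompting for a password."""
    return ["kinit", "-k", "-t", keytab, principal]


def has_valid_tgt():
    """Return True if the credential cache holds unexpired tickets.

    `klist -s` prints nothing and reports the result via its exit status.
    """
    return subprocess.run(["klist", "-s"], check=False).returncode == 0


def ensure_tgt(principal, keytab):
    """Run kinit only when no valid ticket is already cached."""
    if not has_valid_tgt():
        subprocess.run(kinit_command(principal, keytab), check=True)
```

A session start-up script might call `ensure_tgt("sasdemo@HADOOP.EXAMPLE.COM", "/path/to/sasdemo.keytab")` before the SAS process first contacts Hadoop. Both functions that run commands assume the Kerberos client tools are installed and on the PATH.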

The simplest option for SAS administrators is to perform this step on the host running the SAS process as part of the session initialization. In this instance, the SAS session will be launched normally. For example, within an Enterprise Guide session, the end-user still enters a valid user name and password into the connection profile. This action sets up a back-end Kerberos authentication between the SAS process and the Hadoop Services.

[Figure 1: hadoop-topo2]

Placing SAS and Hadoop in the same realm

An alternative to setting up the domain trusts above is to move the SAS servers and SAS High-Performance Analytics nodes into the same “household” as the Hadoop Key Distribution Center, as shown in Figure 2. In this configuration, the end-user logs into the corporate realm and launches a SAS session by entering a user name and password into a SAS client. The same credentials used to start SAS Enterprise Guide, for example, are also valid in the Hadoop realm.

Authentication now takes place in the joint SAS-Hadoop realm without additional mapping required. The SAS servers and SAS High-Performance Analytics nodes can interact with the same Kerberos Key Distribution Center as the Hadoop services because all the components are within the same Kerberos realm.
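To see why a shared realm removes the mapping step, consider a simplified emulation of a default principal-to-user mapping: a principal in the cluster's own realm maps straight to its first component, while a principal from any other realm needs an extra rule. The function below is an illustrative sketch, not Hadoop's own code, and the principal names in the usage note are hypothetical.

```python
def short_name(principal, default_realm):
    """Simplified emulation of a default principal-to-user mapping.

    Accepts principals of the form primary[/instance]@REALM and returns
    the primary when the realm equals default_realm.
    """
    name, _, realm = principal.rpartition("@")
    if not name or realm != default_realm:
        raise ValueError(f"no mapping for {principal!r} without an extra rule")
    return name.split("/", 1)[0]
```

With the default realm set to `HADOOP.EXAMPLE.COM`, the principal `sasdemo@HADOOP.EXAMPLE.COM` resolves to `sasdemo`, whereas `sasdemo@CORP.EXAMPLE.COM` raises an error until a mapping for the corporate realm is added.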

This topology greatly simplifies the Kerberos setup for the SAS components. Kerberos authentication within the Hadoop Realm is straightforward. The only added complexity arises if the customer requires end-to-end Kerberos authentication, in which the SAS session itself is launched using Kerberos and Kerberos authentication flows from the user’s desktop through to the Hadoop services.

[Figure 2: hadoop-topo3]

Where to find more information

SAS provides architecture documents that offer guidelines for ensuring your SAS-Hadoop environment is not only secure but also delivers faster response times.


About Author

Stuart Rogers

Architecture and Security Lead

Stuart Rogers is an Architecture and Security Lead in the Global Enablement and Learning (GEL) Team within SAS R&D's Global Technical Enablement Division. His areas of focus include the SAS Middle Tier and security authentication.

5 Comments

    • Stuart Rogers

      Hello,
      Thank you for taking the time to comment on the blog. I would recommend that you open a Technical Support track to best assist you with the changes you will need to make to get cross realm authentication operating correctly. I would also recommend looking at this SAS Global Forum paper on Kerberos Cross-Realm Authentication. If you are using SAS 9.4 you could also look at this SAS Global Forum paper on SAS 9.4. Alternatively, if you are using SAS Viya look at this paper.

      Thank you for your time.
      Stuart

  1. Hi Stuart,

    I trust all is well at your end.

    Could you please advise me on the best option for cross-domain connectivity between SAS and Hadoop?

    SAS is sitting on the Intranet domain... Hadoop is sitting on the Corp domain. Now we need to connect to Hadoop from SAS.

    What changes do we need to make to the krb5 file, please?

    Kindly help.

    Thanks
